Figure 1: General mathematical modelling


1 Introduction to randomness 

1.1 Random phenomena 

A random phenomenon is a physical phenomenon in which "randomness" plays a part.

So, what is randomness? It is something that we do not control, in the sense that it may lead 
to different outcomes or measurements of the phenomenon in what we believe are "identical" 
conditions. 

There are many keywords associated with the discussion and mathematical foundation of random
phenomena: probability, chance, likelihood, statistical regularity, plausibility, ... There
are whole books discussing and trying to explain the nature of chance and randomness.
It is not worth going into such philosophical depth for the practitioner. One may get
lost in the variety of "definitions" or "trends" related to the word probability (classical,
frequentist, axiomatic, subjective, objective, logical, ...) or statistics (frequentist, classical,
Bayesian, decision-theoretic, ...).

1.2 The modelling point of view 

Instead, take the modelling point of view: Each problem must be treated on its own merits,
choosing the appropriate tools provided by mathematics.

In general, the modelling of a real world phenomenon follows the scheme of Figure 1.

When randomness is present, the scheme is the same. The distinguishing feature is the use
of the mathematical concept of "probability" (which has an unambiguous and worldwide
accepted definition), and the solution to the problem usually comes in the form of a
"probability distribution" or some particular property of a probability distribution. See Figure 2.


1.3 Quantifying randomness: Probability 

Take a playing die, for example (Figure 3). Throwing a die is a familiar random phenomenon.
We need the outcome to be unpredictable (thus potentially different) each time; otherwise the
die is not useful for playing. On the other hand, the experiment is performed each time in
identical conditions: We throw the die on the table so that it rebounds several times before
stopping. Of course, the conditions are not "truly" identical; in this case, our ignorance about
the exact physical conditions provides the desired unpredictability, and therefore the randomness.

A. Alabert

The Modelling of Random Phenomena

Figure 2: Mathematical modelling in the presence of randomness

Figure 3: A playing die developed to show all its faces.

Suppose we examine the die, and we see that it looks new, homogeneous, balanced and
with no visible manufacturing defect. Is there any outcome that looks more likely to appear
than some other? If not, then it is logical that any attempt to quantify the likelihood of the
outcomes leads to assigning the same quantity to all outcomes.

We may think that every outcome takes an equal part of a cake they have to share. Let us say,
arbitrarily, that the cake measures 1. Therefore, every outcome has to take 1/6 of the cake.
We say that every possible result $\omega$ of the random phenomenon "throwing a balanced die"
has a probability of 1/6. See Figure 4.

From the probability of all possible results $\omega \in \Omega$, we can deduce (define, in fact, but in the
only sensible way) the probability of all possible events, that is, subsets $A \subset \Omega$: The event $A$
takes the part of cake that its results $\omega \in A$ take in total.

1.4 The Law of Large Numbers

The relative frequency of an event in a series of identical experiments is the quotient

$$\frac{\text{Number of occurrences of the event}}{\text{Number of experiments performed}} .$$

If 1/6 is the probability of obtaining a 3 when tossing the die, it can be proved that the relative
frequency of the event {3} converges to 1/6 when the number of experiments tends to infinity.

In general, the relative frequency of an event converges to its probability. This is the Law of
Large Numbers. It is a Theorem (an important one). It is not a definition of "probability", as
is frequently said.
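The convergence can be watched in a small simulation. The following is only a sketch, assuming that Python's standard pseudo-random generator is an acceptable stand-in for a balanced die; the function name `relative_frequency` and the seed are illustrative choices, not part of the theory.

```python
import random

def relative_frequency(n_throws, face=3, seed=12345):
    """Relative frequency of `face` in n_throws simulated throws of a balanced die."""
    rng = random.Random(seed)  # fixed seed, so that the run is reproducible
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws

# The relative frequency of {3} should approach 1/6 as the number of throws grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n), "target:", 1 / 6)
```

The simulation illustrates the theorem but proves nothing: for any finite number of throws the relative frequency only approximates 1/6.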


Figure 4: A (presumed) balanced die eating the probability cake. 

1.5 Statistical inference 

We may think that a die is balanced when in fact it is not. In this case, the relative frequencies 
will not converge to the probabilities that we expect. Or, plainly, we suspect that the die is 
not balanced, and we do not know what to expect. 

In any case, the Law of Large Numbers leads to the following idea: 

1. Toss the die as many times as you can. 

2. Write down the relative frequency of each result. 

3. Construct the model of the die by assigning 

Probability of $\omega$ := Relative frequency of $\omega$ .

This is Statistical Inference: We construct a model of a random phenomenon using the data 
provided by a sample of the population. 

The population here is a (hypothetical) infinite sequence of die throws. In the usual
applications, the population is a big, but finite, set of objects (people, animals, machines or anything),
and the sample is a subset of this set.

In another common (and definitely overused) setting of statistical inference, one simply
declares the die as balanced unless the relative frequencies deviate too much from the expected
values. If they do, then the die is declared "non-balanced".
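The three-step recipe above (toss, record the relative frequencies, take them as the model) can be sketched as follows. The loaded die used to produce the sample is hypothetical: the weights in `true_weights` are an assumption made only to have some data to work with, and `estimate_die_model` is an illustrative name.

```python
import random
from collections import Counter

def estimate_die_model(throws):
    """Assign to each face the relative frequency observed in the sample."""
    counts = Counter(throws)
    n = len(throws)
    return {face: counts[face] / n for face in range(1, 7)}

rng = random.Random(2024)                 # fixed seed for reproducibility
true_weights = [1, 1, 1, 1, 1, 2]         # hypothetical die: the 6 is twice as likely
sample = rng.choices(range(1, 7), weights=true_weights, k=50_000)

model = estimate_die_model(sample)
for face, p in model.items():
    print(face, round(p, 4))
```

With this hypothetical die the estimate for the face 6 should come out close to 2/7, and the six estimated probabilities always add up to 1.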

1.6 Probability. The mathematical concept 

We want a mapping that assigns to every event a number called "the probability of the event" 
satisfying: 

1. It is nonnegative. 

2. The probability of the whole set $\Omega$ of possible results is 1.

3. The probability of the union of two disjoint events is the sum of the probabilities of the
two events.


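Formally: A probability is a mapping

$$P\colon \mathcal{P}(\Omega) \to [0,1] , \qquad A \mapsto P(A)$$

such that $P(\Omega) = 1$ and for any countable family $\{A_n\}_n \subset \mathcal{P}(\Omega)$, with $A_i \cap A_j = \emptyset$ if $i \neq j$,

$$P\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} P(A_n) .$$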

This definition captures perfectly the idea of the pieces of cake taken by the different events
that we saw in Figure 4. The extension to a countably infinite union instead of just a finite one
does no harm and allows us to construct a mathematical theory much more in line with the
phenomena that we intend to model. Demanding the same for uncountable unions, on the
contrary, would collapse the theory and make it useless. If $\Omega$ is a finite set, then of course
this discussion is void.

Sometimes it is not possible to define the mapping on the whole collection $\mathcal{P}(\Omega)$ of subsets
of $\Omega$ preserving at the same time the properties of the definition. In this case, we define it on
a subcollection $\mathcal{F} \subset \mathcal{P}(\Omega)$ satisfying some desirable stability properties:

1. $\Omega \in \mathcal{F}$

2. $A \in \mathcal{F} \Rightarrow A^c \in \mathcal{F}$

3. $\{A_n\}_n \subset \mathcal{F} \Rightarrow \bigcup_{n=1}^{\infty} A_n \in \mathcal{F}$,

where $A^c := \Omega - A$ is the complement set of $A$.

These subcollections are called $\sigma$-fields or $\sigma$-algebras. They enjoy the right stability properties
so that the additivity property in the definition of $P$ still makes sense.

Probability Theory is a specialised part of Measure and Integration Theory. In general,
a measure is a function defined on the sets of a $\sigma$-field with values in a set which is not
necessarily the interval [0,1].

1.7 Drawing probabilities 

Probabilities behave like areas of planar regions. Consider Figure 5.

To compute the area of the region $A \cup B$, we may add the areas of $A$ and $B$, and then subtract
the area of $A \cap B$, which has been counted twice. This leads immediately to the fundamental
formula:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) .$$

All usual lists of "properties of the probabilities" are trivial derivations of this formula, and
can also be deduced from Figure 5. It is useless to learn them by heart.

Figure 5: Probabilities and areas



1.8 Conditional probabilities 


Consider the following example (see Figure 6): We have a box with five white balls, numbered
1 to 5, and three red balls, numbered 1 to 3. We pick a ball "completely at random". What is
the probability of drawing an even number?

First of all, what is the probability of picking a particular ball? The expression "completely 
at random", though imprecise, is used to mean that all outcomes are equally likely, as in the 
case of the balanced die. 

We are interested in the event $A = \{W_2, W_4, R_2\}$, where $W$ means white ball and $R$ red ball.
Since each of the balls in $A$ takes $\frac{1}{8}$ of the probability cake, we have that

$$P(A) = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{3}{8} .$$


Now suppose a ball has been picked by someone, who tells us that the ball is white. What is
the probability that the ball carries an even number?

In this case the possible results are $W = \{W_1, W_2, W_3, W_4, W_5\}$, all with probability $\frac{1}{5}$, thus the
probability of $\{W_2, W_4\}$ is $\frac{2}{5}$. The additional information has led us to change the model, and
consequently the value of the probabilities.

Notice that:

$$\frac{2}{5} = \frac{2/8}{5/8} = \frac{P(A \cap W)}{P(W)} ,$$

where the probabilities in the quotient are those of the original model.

The conditional probability of $A$ given $B$ is defined as

$$P(A \mid B) := \frac{P(A \cap B)}{P(B)} .$$

In relation to Figure 5, the conditional probability of $A$ given $B$ is the fraction of the area of $B$
that is covered by $A$.


We say that $A$ and $B$ are independent if the information that $B$ has happened does not change
the probability of $A$:

$$P(A \mid B) = P(A) .$$

Equivalently,

$$P(A \cap B) = P(A) \cdot P(B) .$$
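The box of balls is small enough to check these formulas by direct enumeration. The sketch below uses exact fractions; the helper names `prob` and `cond_prob` are illustrative, and the equiprobable model is the one assumed in the example.

```python
from fractions import Fraction

# Five white balls numbered 1 to 5 and three red balls numbered 1 to 3.
balls = [("W", n) for n in range(1, 6)] + [("R", n) for n in range(1, 4)]

def prob(event):
    """Probability of a set of balls when all balls are equally likely."""
    return Fraction(len(event), len(balls))

def cond_prob(a, b):
    """Conditional probability of a given b: P(a and b) / P(b)."""
    return prob(a & b) / prob(b)

A = {ball for ball in balls if ball[1] % 2 == 0}   # even number
W = {ball for ball in balls if ball[0] == "W"}     # white ball

print(prob(A))           # 3/8
print(cond_prob(A, W))   # 2/5
# Area-like additivity: P(A or W) = P(A) + P(W) - P(A and W)
print(prob(A | W) == prob(A) + prob(W) - prob(A & W))
# Independence would require P(A and W) = P(A) * P(W); here it fails.
print(prob(A & W) == prob(A) * prob(W))
```

In this box, "even" and "white" are not independent: knowing the colour changes the probability of an even number from 3/8 to 2/5.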


1.9 Random variables 

We can now step into a second level of difficulty: the concept of random variable. Let us
consider the following example: We toss two balanced dice, and we are interested in the
sum of the points shown. We may consider directly the set $\Omega = \{2, \dots, 12\}$ and assign
probabilities to each element of $\Omega$, but this is difficult; or we may keep the model closer to the
real experiment by defining $\Omega = \{(i,j) : 1 \le i \le 6,\ 1 \le j \le 6\}$, and think of the mapping

$$\Omega \xrightarrow{\;X\;} \{2, \dots, 12\} , \qquad (i,j) \mapsto i + j .$$

If the dice really look balanced, and if it is clear that the outcome of one die does not influence
the outcome of the other, then it is natural to distribute the same amount of the probability
cake to every pair; that means $P\{(i,j)\} = \frac{1}{36}$.

This setting induces a probability $P_X$ on $\{2, \dots, 12\}$, which is what we are looking for:

$$P_X\{2\} = P\{(1,1)\} = \tfrac{1}{36}$$
$$P_X\{3\} = P\{(1,2),(2,1)\} = \tfrac{2}{36}$$
$$P_X\{4\} = P\{(1,3),(2,2),(3,1)\} = \tfrac{3}{36} , \ \dots , \text{ etc.}$$

In general, a random variable is a mapping $X\colon \Omega \to \mathbb{R}$. ($\mathbb{R}$ can be replaced by other
convenient sets; technically, the random variable must take values in another measure space,
that is, a set endowed with a $\sigma$-field.) The law of a random variable is the probability $P_X$ on
$\mathbb{R}$ induced by $P$ and $X$ as in the example.

From the modelling point of view the law is the important thing, not $\Omega$ or the mapping $X$
themselves. Typically one says: "I am observing a random phenomenon following the law ..."

From the law of a random variable one may define certain numeric values that carry some
information, and that sometimes are all that is needed in a particular application. The most
important one is the expectation, which is the "mean value" that a variable with that law
will take. It can be thought of as the limit of the arithmetic mean of the observed values of the
variable when the number of observations tends to infinity. But this is again a version of the
Law of Large Numbers, and not a definition.

The expectation $E[X]$ of a random variable $X$ with law $P_X$ is defined as

$$E[X] := \sum_k k \cdot P_X\{k\} ,$$

with the sum extended over all values taken by $X$. The variance of $X$ measures the dispersion
of its values around the expectation, and is defined as

$$\mathrm{Var}[X] := E\big[(X - E[X])^2\big] .$$
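For the sum of two balanced dice, the induced law, the expectation and the variance can be computed exactly by enumerating the 36 pairs. This is a minimal sketch; the variable names (`omega`, `law`, and so on) are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (i, j), each with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
p_pair = Fraction(1, len(omega))

# Law of X(i, j) = i + j: add up the probability of the pairs mapped to each value.
law = defaultdict(Fraction)
for i, j in omega:
    law[i + j] += p_pair

expectation = sum(k * p for k, p in law.items())
variance = sum((k - expectation) ** 2 * p for k, p in law.items())

print(dict(law))
print(expectation)   # 7
print(variance)      # 35/6
```

Only the dictionary `law` is needed to compute the expectation and the variance, which illustrates the remark that the law matters, not the sample space or the mapping themselves.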


1.10 The binomial law 


Leaving aside the elementary "equiprobable" or "uniform" model of the balanced die, the
most basic useful example of probability law is the one appearing in the following situation:

Fix an event $A$ of any random experiment. Call $p$ its probability: $P(A) = p$. Repeat $n$ times
the same experiment, and let $X$ be the number of occurrences of $A$ in the $n$ trials. The law of
$X$ is then determined by

$$P\{X = k\} = \binom{n}{k} p^k (1-p)^{n-k} , \qquad k = 0, \dots, n. \tag{1}$$

We write $X \sim \mathrm{Binom}(n,p)$ and say that $X$ follows a binomial law with parameters $(n,p)$.

The sentence "repeating n times the same experiment" means in particular that one experiment
may not influence the result of another, and therefore events concerning the outcome of
one experiment are independent of events concerning the outcome of the other experiments,
in the sense of Section 1.8. This fact is key in the deduction of formula (1).
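The binomial probabilities are easy to evaluate numerically. The sketch below takes as an example the number of threes in 10 throws of a balanced die; the function name `binom_pmf` and the example parameters are illustrative choices.

```python
from math import comb

def binom_pmf(k, n, p):
    """P{X = k} for X ~ Binom(n, p): choose the k successful trials, then multiply."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 1 / 6   # number of threes in 10 throws of a balanced die
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

for k, value in enumerate(pmf):
    print(k, round(value, 5))

total = sum(pmf)                               # should be 1 up to rounding
mean = sum(k * v for k, v in enumerate(pmf))   # should be n * p up to rounding
print(total, mean)
```

The factor `comb(n, k)` counts the ways of placing the k occurrences among the n trials; the product of powers is the probability of each such arrangement, by independence.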


2 Examples from daily life: Arrivals and waiting lines 


2.1 The geometric law 


Assume the experiments of Section 1.10 are performed continuously and at regular unit time 
intervals. We want to know the time elapsed between an occurrence of A and the next 
occurrence of A. Or, in other words, how many experiments are needed before observing 
again the event A. 


This is a situation that may be of interest in manufacturing, where the event $A$ is the
occurrence of a defective item in the production line.

Let $N$ be the number of occurrences of $A^c$ before the next occurrence of $A$. Then it is easy to
deduce

$$P\{N = k\} = (1-p)^k \cdot p , \qquad k = 0, 1, 2, \dots$$

We write $N \sim \mathrm{Geom}(p)$ and say that $N$ follows a geometric law with parameter $p$.

2.2 Tails and the memoryless property

Once we know the density function (or probability function) $k \mapsto P\{N = k\}$, we can
compute, as in the case of the die, the probability $P\{N \in B\}$ of any event, where $B$ is any subset
of $\mathbb{N}$. In particular, we can compute the right and left tails of the law:

$$P\{N \ge k\} = (1-p)^k , \qquad P\{N \le k\} = 1 - (1-p)^{k+1} .$$

Because of the (hypothesized) independence between the experiments, the law of $N$ is the
same if we define $N$ as the number of occurrences of $A^c$ before the first occurrence of $A$. From
this fact one can prove the memoryless property:

$$P\{N \ge m + k \mid N \ge m\} = P\{N \ge k\} .$$

In words, knowing that the event has not appeared in the first $m$ experiments, it is not more
or less likely to appear than if we just start the sequence now.
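The memoryless property can be verified exactly for particular values. The sketch below assumes p = 1/6 and the illustrative values m = 4, k = 3; the right tail is computed by summing the probability function rather than from a closed formula, so the check does not presuppose the result.

```python
from fractions import Fraction

p = Fraction(1, 6)   # illustrative value of P(A)

def geom_pmf(k):
    """P{N = k}: k experiments without A, followed by one with A."""
    return (1 - p) ** k * p

def right_tail(k):
    """P{N >= k}, computed as 1 minus the probability of the values below k."""
    return 1 - sum(geom_pmf(j) for j in range(k))

m, k = 4, 3
conditional = right_tail(m + k) / right_tail(m)   # P{N >= m+k | N >= m}

print(conditional, right_tail(k))
print(conditional == right_tail(k))   # True
```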


2.3 Arrivals at random times: The Poisson law 


Assume now that the arrivals occur at random times instead of regularly. For example, the
arrival of customers at a waiting line may correspond to this situation. To be precise, assume:

1. People arrive alone (never in groups).

2. The probability $p$ that an arrival occurs during a time interval of length $h$ (small) is
proportional to $h$:

$$p = \lambda \cdot h$$

3. The numbers of arrivals in disjoint time intervals are independent random variables.

We would like to know, for instance, the law of the number of arrivals $N_t$ in the interval
$[0,t]$, or the number of arrivals per unit time. The hypotheses above are quite suitable for a
situation where the arrivals can be considered "completely at random".

Of course, hypothesis 2 can only hold true in an infinitesimal sense. Strictly speaking, one
should say $\lim_{h \to 0} p/h = \lambda$.

Now, divide $[0,t]$ into intervals of length $h = t/n$. For $n$ big enough, inside each interval we
will see at most one arrival, and this will happen with probability $\lambda h$. Therefore, the number
of arrivals in $[0,t]$ follows approximately a law $\mathrm{Binom}(n, \lambda t/n)$. Hence, by (1):

$$P\{k \text{ arrivals in } [0,t]\} \approx \binom{n}{k} \Big(\frac{\lambda t}{n}\Big)^{k} \Big(1 - \frac{\lambda t}{n}\Big)^{n-k} .$$

Taking $n \to \infty$,

$$P\{k \text{ arrivals in } [0,t]\} = \frac{(\lambda t)^k}{k!} \exp\{-\lambda t\} . \tag{2}$$

Let $N$ be the number of arrivals per unit time. We write $N \sim \mathrm{Pois}(\lambda)$ and say that $N$ follows
a Poisson law with parameter $\lambda$:

$$P\{N = k\} = \frac{\lambda^k}{k!} \exp\{-\lambda\} .$$

The parameter $\lambda$ is called the traffic intensity.
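The passage to the limit can be observed numerically by evaluating the binomial probability for growing n and comparing it with the Poisson value. In this sketch the intensity, the time and the number of arrivals (2.0, 1.5 and 4) are arbitrary illustrative values, and the function names are not taken from any library.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """Binomial probability of k occurrences in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mean):
    """Poisson probability of k arrivals when the parameter is `mean`."""
    return mean**k / factorial(k) * exp(-mean)

lam, t, k = 2.0, 1.5, 4   # intensity, length of the interval, number of arrivals

# Split [0, t] into n pieces: each holds an arrival with probability lam * t / n.
for n in (10, 100, 1_000, 10_000):
    print(n, binom_pmf(k, n, lam * t / n))

limit = poisson_pmf(k, lam * t)
print("Poisson:", limit)
```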
2.4 Interarrival times: The exponential law

Let $T$ be the time between two arrivals. As in the case of the geometric law, this random
variable is equal in law to the time when the first arrival takes place. The event $\{T > t\}$
means having 0 arrivals in $[0,t]$, whose probability according to (2) is $\exp\{-\lambda t\}$.

We observe that this probability is nonzero for all $t > 0$, and that it cannot be expressed as
the sum of the probabilities of elementary events. We say that the interarrival times follow a
continuous law, in contrast with all laws seen so far, called discrete laws.

In the case of continuous laws, the density is a function $f\colon \mathbb{R} \to \mathbb{R}_+$ such that $P\{T \in [a,b]\}$
is the area under its graph between $a$ and $b$:

$$P\{T \in [a,b]\} = \int_a^b f .$$

To compute the density of the interarrival times, we observe that

$$\int_0^t f = P\{T \in [0,t]\} = 1 - \exp\{-\lambda t\} ,$$

so that, differentiating with respect to $t$,

$$f(t) = \lambda \cdot \exp\{-\lambda t\} .$$

We write $T \sim \mathrm{Exp}(\lambda)$ and say that $T$ follows the exponential law with parameter $\lambda$.

2.5 Continuous laws 

Continuous laws have some features that contrast with those of discrete laws: 

• The law is not determined by the probability of the individual outcomes. 

• It is the density that determines the law. (This can be said to be true also for discrete 
laws, but the concept of "density function" is different.) 

• It is not possible to assign a probability to all subsets of the real line (this is not obvious).
But we do not need to! It is possible to assign a probability to all intervals, and therefore
to the members of the minimal $\sigma$-field containing the intervals, which is far more than
what we need from a practical point of view.

• Continuous laws show why we cannot ask a probability to be additive for collections of
arbitrary cardinality. For example: $1 = P\{T \ge 0\} \neq \sum_{t \ge 0} P\{T = t\} = 0$.

• The expectation of a variable with a continuous law cannot be defined with sums. It is
the integral

$$\int_{-\infty}^{\infty} x f(x)\,dx ,$$

where $f$ is the density. Notice however the analogy with the definition for discrete laws.
In the context of measure theory, the expectation can be expressed in a unified way for
all cases.

The correct name of these laws is absolutely continuous, for mathematical consistency, but
the adverb is frequently dispensed with. "Continuous", strictly speaking, simply means that
the so-called distribution function $F(x) := P\{X \le x\}$, which is always non-decreasing and
right-continuous, is furthermore continuous; whereas "absolutely continuous" refers to the
stronger property that the distribution function is a primitive of another function, the density:

$$F(x) = \int_{-\infty}^{x} f .$$
[Plot: number of arrivals (vertical axis, 0 to 15) against time (horizontal axis, 0 to 20).]

Figure 7: A Poisson sample path with $\lambda = 1$ (red) and with $\lambda = 0.5$ (blue). Lower $\lambda$ means less
frequent arrivals on average.

2.6 Poisson arrivals / Exponential times 

Some further remarks about the relation between the Poisson and the exponential laws:

1. If the interarrival times are $\mathrm{Exp}(\lambda)$, then the arrivals per unit time are $\mathrm{Pois}(\lambda)$.

2. This situation is called "completely random arrivals", in the sense that the arrival times
$0 < t_1 < t_2 < \cdots < t_k < t$ have the law of $k$ independent uniformly distributed values
in $[0,t]$, after sorting them.

3. The exponential laws enjoy the same memoryless property as the geometric law,

$$P\{T > t + s \mid T > s\} = P\{T > t\} ,$$

and they are the only continuous laws with this property. They are a good model for lifetimes of
"ageless devices"; for instance, the lifetime of an electronic device, or of living beings in
their middle age, when death comes from internal or external accidents (electric
shocks, heart attacks, ...).
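The first remark can be illustrated by simulation: generate exponential interarrival times, count the arrivals falling in each unit interval, and compare the counts with the Poisson law. This is a sketch under the assumption that `random.expovariate` is an adequate generator of exponential times; the function name, the intensity 2.0 and the seed are illustrative.

```python
import random
from math import exp

def arrivals_per_unit_time(lam, horizon, rng):
    """Number of arrivals in [0,1), [1,2), ..., when interarrival times are Exp(lam)."""
    counts = [0] * horizon
    t = rng.expovariate(lam)          # time of the first arrival
    while t < horizon:
        counts[int(t)] += 1           # the arrival falls in the unit interval [int(t), int(t)+1)
        t += rng.expovariate(lam)     # add the next interarrival time
    return counts

lam = 2.0
counts = arrivals_per_unit_time(lam, horizon=20_000, rng=random.Random(7))

mean_count = sum(counts) / len(counts)
freq_zero = sum(1 for c in counts if c == 0) / len(counts)

print("mean arrivals per unit time:", mean_count, "intensity:", lam)
print("relative frequency of 0 arrivals:", freq_zero, "Poisson value:", exp(-lam))
```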

2.7 The Poisson process 

The collection of random variables $\{N_t,\ t \ge 0\}$, counting how many arrivals have occurred
in the time interval $[0,t]$, forms the Poisson process.

When we observe a particular arrival phenomenon, we see, as time passes, a sample path of
the Poisson process (see Figure 7). We may also think of the Poisson process as the collection
of all its sample paths.

Figure 8: A typical simple queue: Customers arrive, wait in a line, are served, and leave the system.
(Illustration appeared in The New Yorker, 1977)


2.8 Stochastic processes 

In general, a random evolution in time is modelled by a stochastic process. There are two 
possible points of view of a stochastic process: 

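1. As a collection of random variables:

$$X := \{X_t,\ t \ge 0\} , \quad \text{with } X_t\colon \Omega \to \mathbb{R} .$$

2. As a "random function"

$$X\colon \Omega \to \mathbb{R}^{\mathbb{R}_+} , \qquad \omega \mapsto X(\omega)$$

Here $\mathbb{R}^{\mathbb{R}_+}$ denotes the set of all functions $\mathbb{R}_+ \to \mathbb{R}$, which can be identified with the Cartesian
product of "$\mathbb{R}_+$ copies" of $\mathbb{R}$ as a set, as a topological space and as a measure space.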

2.9 Queues (waiting lines) 

A queue is a situation in which users arrive at a service, wait to be served if the service is
not immediately available, and leave after having been served (Figure 8).

Examples are customers at a supermarket checkout, cars on the highway at the toll booths, and
parts in a manufacturing chain.

Its behaviour depends, among other things, on:

1. Arrival pattern: Interarrival times, number of users per arrival, patience of the
customers, ...

2. Service pattern: Service time, number of users served simultaneously, ...

3. Queue discipline: FIFO (First-In, First-Out), LIFO (Last-In, First-Out), SIRO (Service in
Random Order), ..., with variants specifying priorities, pre-emption, etc.

4. Capacity: Number of users allowed to wait. 

Moreover, everything may be dependent or not on the state of the system (number of users, 
etc.) and the elapsed time since the start. 

Typical questions posed in these situations are:

• How many users are in the line? (at a given time, in the mean, ...)

• How long must a user wait? (a given user, in the mean, ...)

• How much time is a service facility idle?

• How long are the busy/idle periods of the service facility?

The answers are random variables if at least one of the features is random. We would like to
know the law of these variables, or at least their expectation, or some other value of interest.

The purpose of knowing these laws or law parameters is, frequently, to make a decision about
some controllable inputs of the queue, with some cost associated with each of the values
of these inputs. For instance, the number of cashiers in a supermarket clearly influences the
waiting time of the customers; benefits may increase thanks to that, but the running costs are
also higher. Here we enter the realm of optimisation and operations research.


2.10 The M/M/1 queue. Transition probabilities 


Assume that we have Poisson arrivals to a queue, the service time is also random and follows 
an exponential law (just one among some common situations), and there is a single service 
channel (only one user at a time is served). 


More precisely, we now put in rigorous mathematics the hypotheses of Section 2.3. In the
sequel we use the usual notation $o(h)$ to mean any function such that $\lim_{h \to 0} o(h)/h = 0$.
Assume that the arrivals satisfy:

1. $P\{\text{more than one arrival in } [t,t+h]\} = o(h)$

2. $P\{\text{an arrival occurs in } [t,t+h]\} = \lambda h + o(h)$

3. The numbers of arrivals in non-overlapping time intervals are independent random
variables.

And moreover the service times satisfy

1. $P\{\text{more than one service completed in } [t,t+h]\} = o(h)$

2. $P\{\text{a service is completed in } [t,t+h]\} = \mu h + o(h)$ (assuming the service is not idle).

3. The numbers of completed services in non-overlapping time intervals are independent
random variables.

All these properties together imply that we have a queue where the interarrival times follow
the law $\mathrm{Exp}(\lambda)$ and the service times follow the law $\mathrm{Exp}(\mu)$.

Assume, moreover, that, jointly considered, arrivals and services are independent.

Let us call now $N_t$ the number of users in the system at time $t$. We can compute the
probability that the state of the system changes from $n$ users to any other number in some time
interval $[t,t+h]$. These are called the transition probabilities, and can be considered for any
stochastic process. It is easy to find, using the hypotheses above, that:

a) $P\{N_{t+h} = n+1 \mid N_t = n\} = \lambda h + o(h)$, for $n \ge 0$.

b) $P\{N_{t+h} = n-1 \mid N_t = n\} = \mu h + o(h)$, for $n \ge 1$.

c) $P\{N_{t+h} = n \mid N_t = n\} = 1 - (\lambda + \mu) h + o(h)$, for $n \ge 1$, and
$P\{N_{t+h} = 0 \mid N_t = 0\} = 1 - \lambda h + o(h)$.

d) All other transition probabilities are $o(h)$.

2.11 The M/M/1 queue. Differential equations

Fix two times $s < t$. Denote $p_{nm}(s,t)$ the conditional probability of being in state $m$ at time $t$,
conditional on being in state $n$ at time $s$. Then, for $m > 0$,

$$p_{nm}(s,t+h) = \sum_{k \in \mathbb{N}} p_{nk}(s,t) \cdot p_{km}(t,t+h)$$

$$= p_{nm}(s,t) \cdot p_{mm}(t,t+h) + p_{n,m-1}(s,t) \cdot p_{m-1,m}(t,t+h) + p_{n,m+1}(s,t) \cdot p_{m+1,m}(t,t+h) + o(h)$$

$$= p_{nm}(s,t) \cdot \big(1 - (\lambda + \mu) h + o(h)\big) + p_{n,m-1}(s,t) \cdot \big(\lambda h + o(h)\big) + p_{n,m+1}(s,t) \cdot \big(\mu h + o(h)\big) + o(h) .$$

Subtracting $p_{nm}(s,t)$ from both sides, dividing by $h$ and taking $h \to 0$, we obtain

$$\frac{d}{dt} p_{nm}(s,t) = -(\lambda + \mu)\, p_{nm}(s,t) + \lambda\, p_{n,m-1}(s,t) + \mu\, p_{n,m+1}(s,t) .$$

Analogously, for $m = 0$, one finds

$$\frac{d}{dt} p_{n0}(s,t) = -\lambda\, p_{n0}(s,t) + \mu\, p_{n,1}(s,t) .$$

This is a countably infinite system of ordinary differential equations for the conditional
probabilities $p_{nm}(s,t) := P\{N_t = m \mid N_s = n\}$, for $s < t$, and $n, m \in \mathbb{N}$.

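One can also obtain differential equations for the law of $N_t$ itself: Denote $p_n(t) = P\{N_t = n\}$.
For $n > 0$,

$$\frac{d}{dt} p_n(t) = \frac{d}{dt} \sum_{k \in \mathbb{N}} p_k(0)\, p_{kn}(0,t)$$

$$= \sum_{k \in \mathbb{N}} p_k(0) \Big[ -(\lambda + \mu)\, p_{kn}(0,t) + \lambda\, p_{k,n-1}(0,t) + \mu\, p_{k,n+1}(0,t) \Big]$$

$$= -(\lambda + \mu)\, p_n(t) + \lambda\, p_{n-1}(t) + \mu\, p_{n+1}(t) .$$

And, for $n = 0$,

$$\frac{d}{dt} p_0(t) = -\lambda\, p_0(t) + \mu\, p_1(t) .$$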

We get again a countably infinite system of ordinary differential equations. The system can 
be solved exactly but it is difficult and there is a lot of higher mathematics involved. 
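Although the exact solution is hard, the system is easy to integrate numerically once it is truncated to finitely many states. The sketch below is not the exact solution: it assumes an explicit Euler scheme, a cut-off at `n_max` users (arrivals are simply refused in the last state, so that total probability is preserved) and an empty queue at time 0. The function name and all parameter values are illustrative.

```python
def mm1_law(lam, mu, t_end, n_max=30, dt=0.005):
    """Euler integration of the equations for p_0(t), ..., p_{n_max}(t), starting empty."""
    p = [1.0] + [0.0] * n_max                     # at time 0 the system is empty
    for _ in range(int(round(t_end / dt))):
        dp = [0.0] * (n_max + 1)
        dp[0] = -lam * p[0] + mu * p[1]
        for n in range(1, n_max):
            dp[n] = -(lam + mu) * p[n] + lam * p[n - 1] + mu * p[n + 1]
        # Truncation (an assumption of this sketch): no arrivals accepted in state n_max.
        dp[n_max] = -mu * p[n_max] + lam * p[n_max - 1]
        p = [x + dt * d for x, d in zip(p, dp)]
    return p

law = mm1_law(lam=1.0, mu=2.0, t_end=60.0)
rho = 1.0 / 2.0
for n in range(5):
    # Compare with rho**n * (1 - rho), the values that make all derivatives vanish.
    print(n, round(law[n], 5), rho**n * (1 - rho))
```

For these parameter values the computed law at time 60 is already very close to the geometric values printed alongside, which anticipates the question of long-run stabilisation.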


2.12 The M/M/1 queue. Steady-state law 

In the long run, as $t$ grows, does the law of $N_t$ stabilise? If this is true, then the derivatives
in the system of Section 2.11 must vanish when $t \to \infty$:

$$0 = -\lambda p_0 + \mu p_1$$

$$0 = -(\lambda + \mu) p_n + \lambda p_{n-1} + \mu p_{n+1} .$$

By induction,

$$p_n = \Big(\frac{\lambda}{\mu}\Big)^{n} \cdot p_0 .$$

Using as boundary condition $\sum_{n} p_n = 1$, we obtain

$$p_0 = \frac{1}{\sum_{n=0}^{\infty} \big(\frac{\lambda}{\mu}\big)^{n}} ,$$

hence a necessary condition for the existence of a stabilisation is $\lambda < \mu$. Denote $\rho := \lambda/\mu$.
This number is called the traffic intensity of the queue.

If $\rho \ge 1$, no steady state exists; in fact, the queue tends to grow forever, as more and more
users accumulate in it.

If, on the contrary, $\rho < 1$, then $p_0 = 1 - \rho$, and we get

$$p_n = \rho^n (1 - \rho) ,$$

which is the probability of having $n$ users in the system, in the long run.

Knowing the law of the number of users in the system in the long run, it is easy to compute:

• The expectation of the number of users $N$ in the system:

$$E[N] = \frac{\rho}{1 - \rho} .$$

• The expectation of the number of customers $N_q$ in the queue:

$$E[N_q] = \frac{\rho^2}{1 - \rho} .$$

• The law of the waiting time $T_q$ in queue:

$$P\{T_q = 0\} = 1 - \rho ,$$

$$P\{T_q \le t\} = 1 - \rho \exp\{-\mu (1 - \rho) t\} \quad (\text{for } t > 0).$$

• The expectation of $T_q$:

$$E[T_q] = \frac{\lambda}{\mu (\mu - \lambda)} .$$
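The steady-state quantities of the M/M/1 queue are simple enough to be collected in a small helper. In the sketch below, the function name `mm1_measures`, the dictionary keys and the example rates (2 arrivals and 3 services per unit time) are illustrative; the expected number of users is also recomputed from the geometric law of the number of users, truncating the series, as a consistency check.

```python
def mm1_measures(lam, mu):
    """Long-run quantities of an M/M/1 queue with arrival rate lam and service rate mu."""
    if lam >= mu:
        raise ValueError("no steady state: the arrival rate must be below the service rate")
    rho = lam / mu                           # traffic intensity
    return {
        "rho": rho,
        "p0": 1 - rho,                       # probability of an empty system
        "E_N": rho / (1 - rho),              # mean number of users in the system
        "E_Nq": rho**2 / (1 - rho),          # mean number of users waiting in the queue
        "E_Tq": lam / (mu * (mu - lam)),     # mean waiting time in the queue
    }

m = mm1_measures(lam=2.0, mu=3.0)
print(m)

# Consistency check: E[N] as the mean of the law p_n = rho**n * (1 - rho), truncated.
rho = m["rho"]
mean_from_law = sum(n * rho**n * (1 - rho) for n in range(2_000))
print(mean_from_law)
```

With these rates the mean number waiting equals the arrival rate times the mean wait, which is easy to confirm from the printed values.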

2.13 Complex queueing systems. Simulation 

The results above are specific to the M/M/1 queue. There are specific results for other types
of queues, and there are also some general results. For instance, the relations

$$E[N] = \lambda E[T]$$

$$E[N_q] = \lambda E[T_q]$$

(where $T$ is the total time a user spends in the system), which one can easily deduce in the
M/M/1 queue, are true whatever the laws of arrivals and service times.

However, except for relatively easy queue systems, there is no hope to find analytical results, 
as computations become intractable very soon. That means that in the real world, one can 
hardly find closed formulae. 

What to do then? One may propose: 

• Idea 1: Observe the system long enough, take data and do some sort of statistical
inference.

• Idea 2: Simulate the system on a computer, and do statistical inference as well.

For idea 1 to work, we need the system really running, some mechanism of observation, and
a lot of time. In practice, we can seldom afford such luxuries. For idea 2, on the other hand,
we only need, essentially, a mechanism to generate random numbers.

There are very good random number generators embodied in software. Their outcome is
not really random, but they can fool any detector of "non-randomness". Anyway, if the
quality of a stream of such pseudo-random numbers is a concern, it is very easy to use a
true random number generator based on hardware: Nowadays, several internet sites offer
potentially infinite streams of true random numbers produced by a quantum device. And
such devices are quite cheap, in fact.
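As an illustration of idea 2, here is a sketch of a simulation of the M/M/1 queue for which the mean waiting time in queue is known analytically, so that the simulated estimate can be compared with the formula. It relies on assumptions not stated in the text: a FIFO discipline, a system that starts empty, and the Lindley recursion (the wait of a customer is the wait of the previous one, plus that customer's service time, minus the interarrival gap, floored at zero). The function name and parameters are illustrative.

```python
import random

def mean_wait_mm1(lam, mu, n_customers, seed=1):
    """Average waiting time in queue over n_customers, by the Lindley recursion."""
    rng = random.Random(seed)
    wait = 0.0                 # the first customer finds the system empty
    total = 0.0
    for _ in range(n_customers):
        total += wait
        service = rng.expovariate(mu)    # service time of the current customer
        gap = rng.expovariate(lam)       # time until the next customer arrives
        wait = max(0.0, wait + service - gap)
    return total / n_customers

lam, mu = 1.0, 2.0
estimate = mean_wait_mm1(lam, mu, n_customers=400_000)
exact = lam / (mu * (mu - lam))          # analytical mean waiting time in queue
print("simulated:", estimate, "analytical:", exact)
```

For queues with no closed formula, the same loop still works after replacing the two `expovariate` calls with generators of the appropriate service and interarrival laws; only the comparison with an exact value is lost.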

2.14 Birth and death processes 

A birth and death process $N_t$ takes values in $\mathbb{N}$ and the change across an infinitesimal time
interval can only be -1, 0, +1:

$$P\{N_{t+h} = n+1 \mid N_t = n\} = \lambda_n h + o(h)$$

$$P\{N_{t+h} = n-1 \mid N_t = n\} = \mu_n h + o(h)$$

This is a generalisation of the M/M/1 queue model to transition probabilities that may
depend on the system state.

The corresponding system of differential equations for the state of the system becomes

$$\frac{d}{dt} p_n(t) = -(\lambda_n + \mu_n)\, p_n(t) + \lambda_{n-1}\, p_{n-1}(t) + \mu_{n+1}\, p_{n+1}(t)$$

$$\frac{d}{dt} p_0(t) = -\lambda_0\, p_0(t) + \mu_1\, p_1(t)$$

Birth and death processes have been used, for example, to model the varying size of a
biological population under given environmental conditions, or to describe the evolution of an
epidemic.


3 Example from industry: Inventories 



Figure 9: A warehouse

3.1 Inventory modelling

A company distributes some product, maybe after processing some raw material that arrives
at the warehouse. Let us assume that we are dealing only with one product and no processing
time. Assume also that the product has an approximately constant level of demand, but the 
arrival of orders from the clients is not so predictable. The time required to obtain units of 
product from the manufacturer is also subject to some variability. 

Two fundamental questions in this situation are: 

1. When should more items be ordered? 

2. How many items should be ordered when an order is placed? 

A couple of things to take into account: 

• If a customer wants to purchase but we do not have items, the sale is lost. Therefore, it 
is important to have enough items in the warehouse. 

• The product may become obsolete, and there is also a cost of maintaining the inventory. 
Therefore, it is not good to keep in storage too many items. 

Simple hypotheses for an inventory problem that allow analytical computations similar to the
M/M/1 queue are:

• Orders arrive for single items, with random interarrival times following the same law and
independent from each other.

• The times to receive items from the manufacturer (lead times) follow some law, are
independent of each other, and are independent of the order arrivals.

A commonly used simple strategy is the (r,s)-policy: when the inventory drops to r units,
order s − r units. One may measure the performance of this policy, given r and s, by the
average inventory level, or by the average no-inventory time, or by the number of orders that
arrive when the inventory is empty, or, most probably, by a combination of these and other
measures that ultimately reduces to a measure of economic benefit that the company wants
to maximise.

The inventory process, whose paths have the aspect of Figure 10, is not in general a birth and
death process: Items may arrive in batches to the warehouse, and the clients' orders may also
involve more than one unit. It is therefore a generalisation of the situations seen in the last
sections. But it can still be simulated easily if we know the input distributions.
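A simulation of the (r,s)-policy can be sketched in a few lines. Everything specific here is an assumption made for illustration, not part of the model in the text: customer orders are for single items with exponential interarrival times, lead times are exponential, at most one replenishment order is outstanding at a time, unmet demand is lost, and the warehouse starts with s units. The function and parameter names are hypothetical.

```python
import random

def simulate_inventory(r, s, demand_rate, lead_rate, horizon, seed=3):
    """Return (time-averaged inventory level, number of lost sales) up to time `horizon`."""
    rng = random.Random(seed)
    t, level = 0.0, s
    next_demand = rng.expovariate(demand_rate)
    delivery = None            # arrival time of the pending replenishment, if any
    lost, area = 0, 0.0        # lost sales; integral of the level over time
    while True:
        t_next = next_demand if delivery is None else min(next_demand, delivery)
        if t_next > horizon:
            area += level * (horizon - t)
            break
        area += level * (t_next - t)
        t = t_next
        if delivery is not None and delivery <= next_demand:
            level += s - r                      # the batch of s - r units arrives
            delivery = None
        else:
            if level > 0:
                level -= 1                      # the customer order is served
            else:
                lost += 1                       # empty warehouse: the sale is lost
            next_demand = t + rng.expovariate(demand_rate)
        if level <= r and delivery is None:
            delivery = t + rng.expovariate(lead_rate)   # place a replenishment order
    return area / horizon, lost

avg_level, lost_sales = simulate_inventory(r=5, s=20, demand_rate=1.0,
                                           lead_rate=0.2, horizon=10_000)
print("average inventory level:", avg_level, "lost sales:", lost_sales)
```

Running the function over a grid of values of r and s, and attaching costs to the two outputs, gives a crude way of comparing policies; batch orders or other lead-time laws only require changing the lines that generate the random times.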

3.2 Markov chains 

We further generalise by abstracting a remarkable property of the inventory process: If we
know the state of the system at a particular time t, we do not need to know anything about
previous states to predict the future. This is a random analogue of the uniqueness property
of deterministic dynamical systems when proper initial conditions are given.

Formally: If $t_1 < \cdots < t_k < t$,

$$P\{N_t = n \mid N_{t_1} = n_1, \dots, N_{t_k} = n_k\} = P\{N_t = n \mid N_{t_k} = n_k\} .$$

Stochastic processes satisfying this property are called Markov chains, and enjoy an extensive
and quite rich theory.

Figure 10: A path of an inventory process. For some time before day 20 and around day 30 the inventory
was "empty".


3.3 Chapman-Kolmogorov equation 


Consider times $0 \le u < t < s$. Recall the notation of Section 2.11 for the transition
probabilities.

The Chapman-Kolmogorov equation for Markov chains establishes that the probability of
going from state n to state m when time runs from u to s can be computed by decomposing
all possible paths at the intermediate time t:

$$p_{nm}(u,s) = \sum_k p_{nk}(u,t)\, p_{km}(t,s) .$$

We have already used this in Section 2.11.


In particular, the law of the random variable $N_{t+h}$ can be obtained from the law of $N_t$
and the transition probabilities from t to t + h:

$$\begin{aligned}
p_{nm}(0,t+h) &= \sum_k p_{nk}(0,t)\, p_{km}(t,t+h) \\
\sum_n p_n(0)\, p_{nm}(0,t+h) &= \sum_n \sum_k p_n(0)\, p_{nk}(0,t)\, p_{km}(t,t+h) \\
p_m(t+h) &= \sum_k p_k(t)\, p_{km}(t,t+h) .
\end{aligned}$$


3.4 Kolmogorov forward and backward equations 

Assume

$$\begin{aligned}
1 - p_{nn}(t,t+h) &= q_n(t)\,h + o(h) \\
p_{nm}(t,t+h) &= q_{nm}(t)\,h + o(h) , \quad (\text{for } n \neq m)
\end{aligned}$$

for some continuous functions $q_n$ and $q_{nm}$. Then, the following two relations hold:

$$\begin{aligned}
\frac{\partial}{\partial t} p_{nm}(u,t) &= -q_m(t)\, p_{nm}(u,t) + \sum_{k \neq m} p_{nk}(u,t)\, q_{km}(t) \\
\frac{\partial}{\partial u} p_{nm}(u,t) &= q_n(u)\, p_{nm}(u,t) - \sum_{k \neq n} q_{nk}(u)\, p_{km}(u,t)
\end{aligned}$$

These differential equations for the transition probabilities are known as Kolmogorov
equations, forward and backward, respectively.

3.5 Differential equations for the laws 

Assume that the functions $q_n$ and $q_{nm}$ above are constant: $q_n(t) \equiv q_n$ and
$q_{nm}(t) \equiv q_{nm}$. The Markov chain is then called time-homogeneous.

From the Kolmogorov forward equations, letting u = 0, multiplying by $p_n(0)$ and summing over
n, one obtains an (infinite) system of differential equations for the laws of $N_t$:

$$\frac{d}{dt} p_m(t) = -q_m\, p_m(t) + \sum_{k \neq m} p_k(t)\, q_{km} .$$
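
As an illustration, a system of this kind can be integrated numerically once it is made finite. The following is a minimal sketch, not a definitive implementation: it assumes a birth and death chain with constant arrival rate `lam` and service rate `mu` (the M/M/1 queue), truncated at a hypothetical maximum state `K`, and it uses a plain explicit Euler scheme. The function name, the truncation and the numerical values are assumptions made for the example.

```python
def forward_euler_laws(lam, mu, K, p0, t_end, dt=0.01):
    """Integrate dp_m/dt = -q_m p_m + sum_{k != m} p_k q_km by explicit Euler.

    Assumed chain: jumps m -> m+1 at rate lam (for m < K) and m -> m-1 at
    rate mu (for m > 0). The truncation at K is an assumption that makes
    the system finite.
    """
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        new = p[:]
        for m in range(K + 1):
            # q_m: total rate of leaving state m
            q_m = (lam if m < K else 0.0) + (mu if m > 0 else 0.0)
            inflow = 0.0
            if m > 0:
                inflow += p[m - 1] * lam   # q_{m-1,m} = lam
            if m < K:
                inflow += p[m + 1] * mu    # q_{m+1,m} = mu
            new[m] = p[m] + dt * (-q_m * p[m] + inflow)
        p = new
    return p


lam, mu, K = 0.5, 1.0, 30
rho = lam / mu
p0 = [1.0] + [0.0] * K                   # start with an empty system
p = forward_euler_laws(lam, mu, K, p0, t_end=60.0)
print(sum(p))                            # total probability is conserved
print(p[0], 1 - rho)                     # p_0(t) approaches 1 - rho for large t
```

The step `dt` must be small compared with the inverse of the largest rate for the Euler scheme to remain stable; a library ODE solver could be substituted without changing the structure.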


3.6 Long-run behaviour of Markov chains 

In many applications it is of interest to study the behaviour of the chain in the long run. For 
instance: 

• Limiting distributions: Assume that the limits $\lim_{t\to\infty} p_{nm}(u,t)$ exist and are
equal, for all n. That is, the limit is independent of the initial state when time is large. The
limit is a probability law called the limiting or steady-state distribution of the Markov
chain.

• Stationary distributions: If the limit of the laws $\{\lim_{t\to\infty} p_n(t)\}_n$ exists, it is
called the stationary distribution of the chain. If there is a limiting distribution, then it
coincides with the stationary distribution. But the latter may exist independently.

• Ergodicity: Loosely speaking, ergodicity means that some kind of information that
can be extracted from a process as a whole can also be obtained by observing one
single path. For instance, ergodicity with respect to the expectation means that the limit
$\lim_{t\to\infty} E[X(t)]$ coincides with

$$\lim_{t\to\infty} \frac{1}{t} \int_0^t X(s)\, ds$$

for all sample paths X(s). For example, the M/M/1 queue, with traffic intensity $\rho < 1$,
satisfies this property.

In particular, ergodicity implies that simulating a single sample path for a long enough
time is sufficient to estimate the expectation of the process in the long run.

• Classification of states: The elements of the state space of Markov chains are classified
according to different interwoven criteria. Among the most important concepts: A state
is transient if the probability of never returning to it is positive; otherwise it is called
recurrent, and the process will certainly visit that state an infinite number of times; a
state is absorbing if the chain never leaves it once it is reached.
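
The ergodicity property mentioned in the list above can be checked on a simulated path. The sketch below assumes an M/M/1 queue with arrival rate `lam` and service rate `mu`, and compares the time average of the number of customers along one long path with the stationary mean $\rho/(1-\rho)$. The function name, the rates, the time horizon and the seed are assumptions chosen for illustration.

```python
import random


def mm1_time_average(lam, mu, t_end, seed=0):
    """Time average (1/t) * integral_0^t N(s) ds along one simulated M/M/1 path."""
    rng = random.Random(seed)
    t, n, area = 0.0, 0, 0.0
    while t < t_end:
        rate = lam + (mu if n > 0 else 0.0)      # total jump rate in state n
        dt = min(rng.expovariate(rate), t_end - t)
        area += n * dt                            # accumulate the path integral
        t += dt
        if t >= t_end:
            break
        # next jump: arrival with probability lam / rate, otherwise a departure
        if rng.random() < lam / rate:
            n += 1
        else:
            n -= 1
    return area / t_end


lam, mu = 0.5, 1.0
rho = lam / mu
estimate = mm1_time_average(lam, mu, t_end=50_000.0)
print(estimate, rho / (1 - rho))   # single-path average vs. stationary mean
```

The closer the traffic intensity is to 1, the longer the path needed for the two numbers to agree.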

Figure 11: A time series: A discrete time stochastic process with 144 values corresponding to the
number of airline passengers (in thousands) between 1949 and 1960.

3.7 Stochastic processes in discrete time 

A discrete time stochastic process is a process where the family of random variables is
indexed by a discrete set, usually Z or N.

A discrete time Markov chain has the same definition as a (continuous time) Markov chain,
except that the index t runs over a discrete set, usually the non-negative integers.

Another important class of stochastic processes in discrete time is the time series, which
models a different sort of dependency between variables. Figure 11 shows the monthly evolution
of the number of passengers of international airlines between January 1949 and December 1960.
One observes a trend (increasing), a seasonality (peaks at the central months of the year) and
a residual noise (the purely random component of the process). Usually, one tries to fit a
suitable model of dependence between the variables, so that the original process is expressed
as the sum of these individual components.


4 Example from biology: genes 

4.1 Genotype and gene frequencies 

Alleles are the different forms that a gene may have at a particular place (locus) of a chromosome.

For example, sheep haemoglobin presents two forms, produced by two alleles, A and B, of a 
certain locus. Each individual possesses chromosomes in pairs, one coming from each parent. 
This implies that there are three possible genotypes: AA, AB, BB. 

Typically, one allele is dominant, while the other is recessive. The recessive allele shows up
externally in the phenotype only if the dominant is not present.

Figure 13: Some more sheep!

Assume we extract blood from a population of N sheep, and the genotypes appear in
proportions $P_{AA}$, $P_{AB}$ and $P_{BB}$ (called genotypic frequencies). The gene frequencies
are the proportions of the two alleles:

$$\begin{aligned}
P_A &:= P_{AA} + \tfrac{1}{2} P_{AB} \\
P_B &:= P_{BB} + \tfrac{1}{2} P_{AB}
\end{aligned} \qquad (3)$$


4.2 Hardy-Weinberg principle 

Assume that: 

• The proportions are the same for males and females. 

• The genotype does not influence mating preferences. 

• Each allele of a parent is chosen with equal probability 1/2. 


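Then, the probabilities of each mating are, approximately (assuming a large population):

$$\begin{aligned}
P(AA \text{ with } AA) &= P_{AA}^2 \\
P(AB \text{ with } AB) &= P_{AB}^2 \\
P(BB \text{ with } BB) &= P_{BB}^2 \\
P(AA \text{ with } AB) &= 2 P_{AA} P_{AB} \\
P(AA \text{ with } BB) &= 2 P_{AA} P_{BB} \\
P(AB \text{ with } BB) &= 2 P_{AB} P_{BB}
\end{aligned}$$

We can deduce easily the law of the genotypes for the next generation:

$$\begin{aligned}
Q_{AA} &= P_{AA}^2 + \tfrac{1}{2} \cdot 2 P_{AA} P_{AB} + \tfrac{1}{4} P_{AB}^2 = P_A^2 \\
Q_{BB} &= P_{BB}^2 + \tfrac{1}{2} \cdot 2 P_{BB} P_{AB} + \tfrac{1}{4} P_{AB}^2 = P_B^2 \\
Q_{AB} &= 2 P_A P_B
\end{aligned}$$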

Computing the gene frequencies $Q_A$ and $Q_B$ with (3) we find again $P_A$ and $P_B$, so the
genotype frequencies must be constant from the first generation onwards. This is the
Hardy-Weinberg principle (1908).
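
The principle can be verified numerically in a few lines. The sketch below applies the gene frequency formulas (3) and the next-generation genotype formulas to an arbitrary starting population; the function names and the starting proportions are assumptions made for the example.

```python
def gene_frequencies(p_aa, p_ab, p_bb):
    """Gene frequencies (P_A, P_B) from the genotype frequencies, as in (3)."""
    return p_aa + 0.5 * p_ab, p_bb + 0.5 * p_ab


def next_generation(p_aa, p_ab, p_bb):
    """Genotype frequencies (Q_AA, Q_AB, Q_BB) after one round of random mating."""
    p_a, p_b = gene_frequencies(p_aa, p_ab, p_bb)
    return p_a ** 2, 2 * p_a * p_b, p_b ** 2


gen0 = (0.5, 0.2, 0.3)                # arbitrary starting proportions (assumed)
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)
print(gene_frequencies(*gen0))         # gene frequencies before mating
print(gene_frequencies(*gen1))         # unchanged after mating
print(gen1, gen2)                      # genotype frequencies stay constant
```

The genotype proportions of generation 0 need not be in equilibrium; they settle after a single generation and do not move afterwards.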

As an application of this principle, suppose B is recessive and we observe a 4% proportion
of individuals showing the corresponding phenotype. Then we can deduce the genotype
proportions of the whole population:

$$4\% = P_{BB} = P_B^2 \;\Rightarrow\; P_B = 20\% , \; P_A = 80\% , \; P_{AA} = 64\% , \; P_{AB} = 32\% .$$

If the population is small, then randomness in the mating may lead to genetic drift, and
eventually one of the alleles will disappear from the population. The other gets fixed, and
the time to fixation is one of the typical quantities of interest. This purely random fact
explains the loss of genetic diversity in closed small populations.

4.3 Wright-Fisher model (1931) 

If the mating is completely random, it does not matter how the alleles are distributed among 
the N individuals. We can simply consider the population of 2N alleles. 

Assume that at generation 0 there are $X_0$ alleles of type A, with $0 < X_0 < 2N$. We pick alleles
from this population independently from each other (with replacement) 2N times to form the
N individuals of generation 1.

The law of the number $X_1$ of alleles of type A in generation 1 must be Binom(2N, p), with
$p = X_0/2N$. Thus

$$P\{X_1 = k\} = \binom{2N}{k} (X_0/2N)^k (1 - X_0/2N)^{2N-k} , \quad k = 0, \dots, 2N .$$

In general, the law of the number of alleles A in generation n + 1, knowing that there are j
in generation n, is

$$P\{X_{n+1} = k \,/\, X_n = j\} = \binom{2N}{k} (j/2N)^k (1 - j/2N)^{2N-k} .$$

This defines a Markov chain in discrete time. Its expectation is constant, $E[X_n] = E[X_0]$, and
the expectation of the random variable $X_n$, knowing that a past variable $X_m$ (m < n) has taken
value k, is equal to k:

$$E[X_n \,/\, X_m = k] = k , \quad k = 0, \dots, 2N . \qquad (4)$$

However, as we saw in Section 4.2, the process will eventually reach states 0 or 2N, and it will
remain there forever. They are absorbing states (see Section 3.6).
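
A path of this chain is easy to simulate, which gives a feeling for the time to fixation. The following is a minimal sketch: each generation is drawn from Binom(2N, j/2N) by counting successes in 2N independent draws. The function name, the population size, the seed and the safety cap on the number of generations are assumptions of the example.

```python
import random


def wright_fisher_path(n_individuals, x0, seed=0, max_generations=100_000):
    """Simulate X_0, X_1, ... until an absorbing state (0 or 2N) is reached."""
    rng = random.Random(seed)
    two_n = 2 * n_individuals
    path = [x0]
    while 0 < path[-1] < two_n and len(path) <= max_generations:
        p = path[-1] / two_n
        # X_{n+1} ~ Binom(2N, p): count successes in 2N independent draws
        path.append(sum(rng.random() < p for _ in range(two_n)))
    return path


path = wright_fisher_path(n_individuals=20, x0=20)
print(len(path) - 1, path[-1])   # generations until fixation, and the fixed state
```

Repeating the simulation with different seeds gives an empirical law for the time to fixation and for which allele gets fixed.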
4.4 Conditional expectation

The expression on the left-hand side of (4) is called the conditional expectation of $X_n$ given
that $X_m = k$. It is exactly the expectation of $X_n$ computed from the conditional probability
given the event $\{X_m = k\}$. One may write (4) as

$$E[X_n \,/\, X_m] = X_m$$

and the left-hand side is now a random variable instead of a single number, called the
conditional expectation of $X_n$ given $X_m$. For each $\omega \in \Omega$, the random variable
$E[X \,/\, Y](\omega)$ is equal to the number $E[X \,/\, Y = y]$, if $Y(\omega) = y$.

In case the conditioning random variable Y has a continuous law, the definition above does
not work, since $\{Y(\omega) = y\}$ is an event of probability zero. The intuitive meaning is however
the same. Mathematically, the trick is not to consider the y individually, but collectively: The
conditional expectation $E[X \,/\, Y]$ is given a sense as the (unique) random variable that can
be factorized as a composition $(\varphi \circ Y)(\omega)$, with $\varphi\colon \mathbb{R} \to \mathbb{R}$, and whose
expectation, restricted to the events of Y, coincides with that of X:

$$E[(\varphi \circ Y) \cdot 1_{\{Y \in B\}}] = E[X \cdot 1_{\{Y \in B\}}] ,$$

where $1_{\{Y \in B\}}$ is equal to 1 if $Y(\omega) \in B$ and 0 otherwise.


4.5 Continuous approximation of discrete laws 


Discrete laws involve only elementary discrete mathematics, but they are sometimes
cumbersome to compute with. For instance, computing exactly the probability density of a
Binom(n, p) distribution when n is large involves making the computer work with higher
precision than usual. Although nowadays this is not a big deal (unless n is really very large),
it is still useful, and conceptually important, to use continuous laws as a proxy for the real
distribution.

Specifically, for the binomial law: If X ~ Binom(n, p), then

$$\frac{X - np}{\sqrt{np(1-p)}} \sim N(0,1) , \quad \text{(approximately, for n large)}$$

where N(0,1) denotes the so-called Normal (or Gaussian) law with expectation 0 and
variance 1. Its density function is the Gaussian bell curve

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} .$$

Figure 14 shows graphically the approximation.
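
The quality of the approximation can also be checked numerically. The sketch below, written under the assumption that n = 45 and p = 0.7 as in Figure 14, compares the exact binomial probabilities with the Gaussian density evaluated at the centred and reduced points. The helper names are hypothetical.

```python
from math import comb, exp, pi, sqrt


def binom_pmf(n, p, k):
    """Exact probability P{X = k} for X ~ Binom(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def normal_density(x):
    """Density of the N(0,1) law."""
    return exp(-x * x / 2) / sqrt(2 * pi)


n, p = 45, 0.7
mean, sd = n * p, sqrt(n * p * (1 - p))
# After centring and reducing, the atoms are spaced 1/sd apart, so the
# probability of k is compared with normal_density(z) / sd.
max_gap = max(
    abs(binom_pmf(n, p, k) - normal_density((k - mean) / sd) / sd)
    for k in range(n + 1)
)
print(max_gap)   # largest pointwise discrepancy over k = 0..n
```

Increasing n while keeping p fixed makes the discrepancy shrink, in agreement with the approximation statement above.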

The importance of the Gaussian law comes from the Central Limit Theorem, which explains
its ubiquity: If $\{X_n\}$ is a sequence of independent identically distributed random variables,
with finite variance, and $S_n := \sum_{i=1}^n X_i$, then

$$\frac{S_n - E[S_n]}{\sqrt{\mathrm{Var}[S_n]}} \quad \text{converges in law to } N(0,1) .$$


We immediately see that the binomial case above is a particular application of this theorem,
taking $X_i \sim$ Binom(1, p), which implies $S_n \sim$ Binom(n, p). Convergence in law is a
non-elementary concept that has to do with duality in functional spaces: Suppose that $\{Y_n\}$
is a sequence of random variables with respective distributions $P_n$, and Y is a random variable
with distribution P. Then, we say that $\{Y_n\}$ converges in law to P if for every bounded
continuous function $f\colon \mathbb{R} \to \mathbb{R}$,

$$\lim_{n\to\infty} E[f(Y_n)] = E[f(Y)] .$$

The seemingly natural "setwise" convergence $\lim_{n\to\infty} P_n(A) = P(A)$ for all sets A is too
strong, and will not work for the purpose of approximating by continuous distributions.

Figure 14: Approximation by the Gaussian law: The probability density of Binom(n = 45, p = 0.7),
after subtracting its expectation (centring) and dividing by the square root of its variance (reducing),
depicted with black vertical lines. In red, the density function of N(0,1).

One practical consequence of the Central Limit Theorem for modelling is that any
phenomenon whose result is the sum of many quantitatively small causes (like for instance the
height or the weight of a person) will be well described by a Gaussian random variable. The
fact that the Gaussian laws may take values in any interval of the real line is not an
obstacle due to the rapid decrease of the bell curve: Outside a short interval, the probability is
extremely small.

4.6 Random walk and the Wiener process 

Let $\{X_n\}_n$ be a Markov chain taking values in Z with

$$\begin{aligned}
X_0 &= 0 \\
P\{X_{n+1} = i+1 \,/\, X_n = i\} &= 1/2 \\
P\{X_{n+1} = i-1 \,/\, X_n = i\} &= 1/2 .
\end{aligned}$$

This process is called a random walk. It is simply a "walk" on the integer lattice, where at each
time step we go to the left or to the right according to the toss of a fair coin. In other words,
the increments $\varepsilon_n := X_n - X_{n-1}$ are independent and take values 1 and -1 with
probability 1/2.

Define a sequence of continuous-time processes by renormalisation of a random walk:

$$W_t^N := \frac{1}{\sqrt{N}} X_{[Nt]} = \frac{1}{\sqrt{N}} \sum_{k=1}^{[Nt]} \varepsilon_k .$$

By the Central Limit Theorem,

$$\frac{1}{\sqrt{Nt}} \sum_{k=1}^{[Nt]} \varepsilon_k \quad \text{converges in law to } N(0,1) ,$$

hence the sequence $\{W_t^N\}_N$ converges in law to a random variable $W_t \sim N(0,t)$, the Gaussian
law with variance t, for all t > 0, whose density is

$$f(x) = \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/(2t)} .$$

Analogously, $W_t^N - W_s^N$ converges in law to $W_t - W_s \sim N(0, t-s)$. The limiting process $W_t$
satisfies:


1. The increments in non-overlapping intervals are independent. 

2. The expectation is constant equal to zero. 

3. The sample paths are continuous functions. 

4. The sample paths are non-differentiable at any point. 

W is called the Wiener process. In fact, a Wiener process is defined by its laws, but usually 
it is additionally asked to have continuous paths. This particular construction as the limit of 
random walks leads indeed to continuous paths. 
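
The renormalisation can be tried out directly. The sketch below draws many independent values of the rescaled walk at a fixed time t and checks that their empirical mean and variance are close to 0 and t, as the Gaussian limit N(0,t) predicts. The function name, the scale N, the time t, the number of paths and the seed are assumptions of the example.

```python
import random
from math import sqrt


def scaled_walk_value(n_scale, t, rng):
    """One draw of N^{-1/2} * (eps_1 + ... + eps_[Nt]) with fair +/-1 steps."""
    steps = int(n_scale * t)
    return sum(rng.choice((-1, 1)) for _ in range(steps)) / sqrt(n_scale)


rng = random.Random(1)
n_scale, t, n_paths = 100, 2.0, 2000
samples = [scaled_walk_value(n_scale, t, rng) for _ in range(n_paths)]
mean = sum(samples) / n_paths
var = sum((w - mean) ** 2 for w in samples) / (n_paths - 1)
print(mean, var)   # should be close to 0 and to t, respectively
```

A histogram of `samples` would show the bell shape of the limiting density; only the first two moments are checked here to keep the example short.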

The Wiener process is also called Brownian motion in the mathematical literature. However,
Brownian motion is a physical phenomenon, and the Wiener process is just a mathematical
model (and not the best one) of that phenomenon.


4.7 Diffusion approximation of Wright-Fisher model 

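The Markov chain of the Wright-Fisher model is too complicated to work upon. Instead,
define $Y_t^N := \frac{1}{2N} X_{[2Nt]}$. Then, for $h = 1/(2N)$,

$$E[(Y^N_{t+h} - Y^N_t)^2 \,/\, Y^N_t = x] = \frac{1}{(2N)^2}\, 2N x(1-x) = h\, x(1-x) .$$

The limiting process $Y_t$ exists, satisfies

$$E[(Y_{t+h} - Y_t)^2 \,/\, Y_t = x] = h\, x(1-x) + o(h)$$

and it is called the diffusion approximation of the original Markov chain.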
4.8 Diffusions

A diffusion Y is a continuous-time Markov process, with continuous paths, and such that

1. $E[Y_{t+h} - Y_t \,/\, Y_t = x] = b(t,x)\,h + o(h)$

2. $E[(Y_{t+h} - Y_t)^2 \,/\, Y_t = x] = a(t,x)\,h + o(h)$

for some functions a and b. See Section 4.4 for the interpretation of the conditional
expectations when the conditioning variable is continuous.


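Under mild conditions, $Y_t$ has a continuous law with density f(t,x) satisfying the
Kolmogorov forward and backward equations:

$$\frac{\partial}{\partial t} f(t,x) = \frac{1}{2} \frac{\partial^2}{\partial x^2} \big[a(t,x) f(t,x)\big] - \frac{\partial}{\partial x} \big[b(t,x) f(t,x)\big] \qquad (5)$$

$$-\frac{\partial}{\partial s} f(s,x) = \frac{1}{2} a(s,x) \frac{\partial^2}{\partial x^2} f(s,x) + b(s,x) \frac{\partial}{\partial x} f(s,x) \qquad (6)$$
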
The Wright-Fisher model can be expanded to take into account other effects in population
dynamics, such as selection or mutation. These complications make the corresponding
diffusion approximations even more useful.


5 Example from economy: stock markets 



Figure 15: Kuwait stock market 


5.1 A binomial economy 

Assume an economy with only two states:

• Up (with probability p)

• Down (with probability $1 - p$)

Assume that there are two assets:

• A risk-free bond with interest rate R, and

• A share with price S(0) at time 0 and S(1) at time 1, given by

$$S(1) = \begin{cases} S(0)\,u , & \text{if the economy is "up"} \\ S(0)\,d , & \text{if the economy is "down"} \end{cases}$$


A trading strategy for a portfolio is defined by

• $B_0$ € allocated to the bond, and

• $\Delta_0$ quantity of shares of the stock

at time zero. The values of the portfolio at times 0 and 1 are

$$\begin{aligned}
V(0) &= B_0 + \Delta_0 S(0) \\
V(1) &= B_0 (1+R) + \Delta_0 S(1)
\end{aligned}$$


5.2 Free lunch? 


As we will see, one can make money for free, unless $d < 1 + R < u$. An arbitrage opportunity
is the situation in which, without investing any money at time zero, the probability to have a
positive portfolio at time one is positive, and the probability of a loss is zero.

For a couple $(B_0, \Delta_0)$ such that V(0) = 0,

$$V(1) = B_0(1+R) + \Delta_0 S(1) = \begin{cases} B_0(1+R) + \Delta_0 u S(0) \\ B_0(1+R) + \Delta_0 d S(0) \end{cases} = \begin{cases} \Delta_0 S(0) \cdot [u - (1+R)] \\ \Delta_0 S(0) \cdot [d - (1+R)] \end{cases}$$

with respective probabilities p and $1 - p$.

If (1 + R) < d, both brackets are positive and we could borrow money to buy assets
($\Delta_0 > 0$) to have a sure win. If (1 + R) > u, both brackets are negative and we could make
money by selling assets ($\Delta_0 < 0$) and buying bonds. If V(0) ≠ 0, the argument is equally
valid.

The arbitrage situation is not realistic if all the actors have complete information. Thus, 
usually there is no free lunch! 


5.3 European options 

A European call option is a financial derivative: It gives the holder the right (not the
obligation) to buy a share for a pre-specified amount (exercise price K) on a specific later date
(expiry date T). Similarly, a European put option is the right to sell the share.

If S(T) is the value of the share at time T, the payoff of a call is $(S(T) - K)^+$. If S(T) < K,
the holder does not exercise the option, since they can buy the share in the market for a
cheaper price, so the payoff is never negative.

Correspondingly, the payoff of a put is $(K - S(T))^+$, see Figure 16.

Figure 16: Graphs of a European call and a European put


5.4 Fair price of a European call option. Example

Assume the following data: 

• Current price of the share: S (0) = 100 

• Interest of the risk-free bond: 10% 

• Possible prices for the share at time 1: 120 or 90 

• Exercise price: K = 100 

We have u = 1.2, d = 0.9, 1 + R = 1.1. The payoff will be $C_u := 20$ or $C_d := 0$.

To find the fair price, let us construct a portfolio with a value V(1) equal to the payoff of the
option. The fair price will be V(0).

$$\begin{cases} B_0 \cdot 1.1 + \Delta_0 S(0) \cdot 1.2 = 20 \\ B_0 \cdot 1.1 + \Delta_0 S(0) \cdot 0.9 = 0 \end{cases} \;\Rightarrow\; \Delta_0 S(0) = 66.67 , \quad B_0 = -54.55$$

The fair price is thus $V(0) = B_0 + \Delta_0 S(0) = 12.12$.
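
The replication argument amounts to solving a 2×2 linear system, which can be sketched as follows. The function and variable names are assumptions made for the example; the data are those of the example above.

```python
def replicating_portfolio(s0, u, d, rate, payoff_up, payoff_down):
    """Solve B0*(1+R) + D0*S0*u = C_u and B0*(1+R) + D0*S0*d = C_d."""
    stock_value = (payoff_up - payoff_down) / (u - d)    # Delta_0 * S(0)
    bond = (payoff_up - stock_value * u) / (1 + rate)    # B_0
    return bond, stock_value / s0                        # (B_0, Delta_0)


s0, strike = 100.0, 100.0
u, d, rate = 1.2, 0.9, 0.10
c_up = max(s0 * u - strike, 0.0)      # payoff if the economy is "up"
c_down = max(s0 * d - strike, 0.0)    # payoff if the economy is "down"
bond, delta = replicating_portfolio(s0, u, d, rate, c_up, c_down)
price = bond + delta * s0             # V(0), the fair price
print(round(delta * s0, 2), round(bond, 2), round(price, 2))
```

The negative bond position means that the replicating portfolio borrows money to buy a fraction of a share.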


5.5 Fair price of a European call option. In general

In general, we have

$$V(1) = \begin{cases} B_0(1+R) + \Delta_0 S(0)\,u = C_u \\ B_0(1+R) + \Delta_0 S(0)\,d = C_d \end{cases} \;\Rightarrow\; B_0 = \frac{u C_d - d C_u}{(1+R)(u-d)} , \quad \Delta_0 S(0) = \frac{C_u - C_d}{u - d}$$

$$\Rightarrow\; B_0 + \Delta_0 S(0) = (1+R)^{-1} \big( C_u q + C_d (1-q) \big) , \quad \text{where } q = \frac{1+R-d}{u-d} .$$

It follows that the fair price of the option is the expected (and discounted!) payoff of the
option under the probability $Q = (q, 1-q)$ for the states of the economy:

$$E_Q\big[(1+R)^{-1} (S(1) - K)^+\big] \qquad (7)$$

Some remarks on the probability Q:

• Under Q, the share and the bond have the same expected return:

$$E_Q[S(1)] = S(0)\,u q + S(0)\,d (1-q) = S(0) \Big( u\, \frac{1+R-d}{u-d} + d\, \frac{u-1-R}{u-d} \Big) = S(0)(1+R) .$$

• The probability Q does not depend on the underlying probability $P = (p, 1-p)$ nor on
the payoff of the option.

• Q is called the risk-neutral probability (or martingale probability).


5.6 Fair price of a European call option. Example (cont.)

With the same data as before, we compute now the fair price directly using formula (7), where
in this case q = 2/3.

$$E_Q\big[(1+R)^{-1} (S(1) - K)^+\big] = (1+R)^{-1} \big( C_u \cdot q + C_d \cdot (1-q) \big) = \frac{1}{1.1} \Big[ 20 \cdot \frac{2}{3} + 0 \cdot \frac{1}{3} \Big] = 12.12$$

Assume now that the exercise price is fixed to K = 95 instead of K = 100, while all other
data remain the same. Logically, the option should be more expensive in this case. Applying
again formula (7), now with $C_u = 25$ and $C_d = 0$,

$$E_Q\big[(1+R)^{-1} (S(1) - K)^+\big] = (1+R)^{-1} \big( C_u \cdot q + C_d \cdot (1-q) \big) = \frac{1}{1.1} \Big[ 25 \cdot \frac{2}{3} + 0 \cdot \frac{1}{3} \Big] = 15.15 .$$
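
Both computations follow the same recipe, which can be written as a small function. This is a sketch of the one-period risk-neutral formula with the data of the example; the function name and argument names are assumptions.

```python
def risk_neutral_price(s0, u, d, rate, strike):
    """One-period call price (1+R)^{-1} * (C_u*q + C_d*(1-q)), q = (1+R-d)/(u-d)."""
    q = (1 + rate - d) / (u - d)
    c_up = max(s0 * u - strike, 0.0)
    c_down = max(s0 * d - strike, 0.0)
    return (c_up * q + c_down * (1 - q)) / (1 + rate)


for strike in (100.0, 95.0):
    print(strike, round(risk_neutral_price(100.0, 1.2, 0.9, 0.10, strike), 2))
```

Note that the probability p of the "up" state does not appear anywhere in the function, in line with the remarks on Q.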


5.7 European call option. Multiperiod 

The previous sections dealt with a single time period. Assume now that the expiry time of the
option is T and that we can change the composition of the portfolio at any of the intermediate
integer times.

A trading strategy is then $\{(B_t, \Delta_t),\ 0 \le t \le T-1\}$. It is called a self-financing
strategy if we do not put new money or take money out of the portfolio.

At time t, we can change the portfolio composition, but the value remains the same:

$$B_t + \Delta_t S(t) = B_{t+1} + \Delta_{t+1} S(t) .$$

The new value at time t + 1 will be:

$$B_{t+1}(1+R) + \Delta_{t+1} S(t+1) .$$

Therefore, the value increment for a self-financing strategy is

$$V(t+1) - V(t) = B_{t+1} R + \Delta_{t+1} \big( S(t+1) - S(t) \big) .$$

We can compute the fair price F(0) at time 0 of an option with exercise value K at expiry date
T recursively:

$$\begin{aligned}
F(T-1) &= E_Q\big[(1+R)^{-1} (S(T) - K)^+ \,/\, S(T-1)\big] \\
F(T-2) &= E_Q\big[(1+R)^{-1} F(T-1) \,/\, S(T-2)\big] = E_Q\big[(1+R)^{-2} (S(T) - K)^+ \,/\, S(T-2)\big] \\
&\;\;\vdots \\
F(0) &= E_Q\big[(1+R)^{-T} (S(T) - K)^+\big] .
\end{aligned}$$


This computation uses essential properties of the conditional expectation that we are not 
going to detail here. But the conclusion must be quite intuitive. 
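
The recursion can be made concrete on a binomial tree. The sketch below assumes that the factors u and d and the rate R are the same in every period and that the moves in different periods are independent under Q; these are assumptions of the example, as are the function names. It computes the price both by backward recursion and by the direct expectation at time 0, so the two can be compared.

```python
from math import comb


def call_price_backward(s0, u, d, rate, strike, periods):
    """Backward recursion F(t) = (1+R)^{-1} E_Q[F(t+1) / S(t)] on the tree."""
    q = (1 + rate - d) / (u - d)
    # values[j]: option value at the node with j up-moves, starting at expiry T
    values = [max(s0 * u**j * d**(periods - j) - strike, 0.0)
              for j in range(periods + 1)]
    for t in range(periods - 1, -1, -1):
        values = [(q * values[j + 1] + (1 - q) * values[j]) / (1 + rate)
                  for j in range(t + 1)]
    return values[0]


def call_price_direct(s0, u, d, rate, strike, periods):
    """F(0) = E_Q[(1+R)^{-T} (S(T)-K)^+], summing over the number of up-moves."""
    q = (1 + rate - d) / (u - d)
    expected = sum(comb(periods, j) * q**j * (1 - q)**(periods - j)
                   * max(s0 * u**j * d**(periods - j) - strike, 0.0)
                   for j in range(periods + 1))
    return expected / (1 + rate) ** periods


args = (100.0, 1.2, 0.9, 0.10, 100.0)
print(call_price_backward(*args, periods=1))
print(call_price_backward(*args, periods=3), call_price_direct(*args, periods=3))
```

With one period the recursion reduces to the single-period example; with more periods the two functions agree up to rounding.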


5.8 Martingales 


Under probability Q, the stochastic process $(1+R)^{-t} S(t)$ enjoys the martingale property. A
stochastic process $\{X_t, t \ge 0\}$ is a martingale if

$$E\big[X_t \,/\, \{X_r\}_{r \le s}\big] = X_s , \quad \text{whenever } s \le t , \qquad (8)$$

meaning that the knowledge of the state of the system at time s makes this the expected value
at any later time. The discrete time process defined in Section 4.3 is a discrete time martingale
(see Equation (4)).

Martingales are good models for fair games: The expected wealth of a player in the future is
the current wealth, no matter what happened before, or how long they have been playing.

From (8) it can be deduced in particular that the expectation of the process is constant in time.
In our case of the European call option, this means

$$E_Q\big[(1+R)^{-t} S(t)\big] = S(0) ,$$

implying that

$$E_Q[S(T)] = S(0) \cdot (1+R)^T ,$$

which is precisely the return of the risk-free bond (and this is why Q is called a "risk neutral"
probability measure).

5.9 European call option. Continuous time 

In continuous time, it can be shown that there is also a probability Q under which $e^{-Rt} S(t)$ is
a martingale, and the fair price at time 0 of a call option is given by

$$F(0) = E_Q\big[e^{-RT} (S(T) - K)^+\big] ,$$

although Q is more difficult to describe here.

The evolution of the value of the bond asset I(t) is driven by the well-known differential
equation

$$dI(t) = R \cdot I(t)\, dt .$$

The evolution of the price of the share can be described as

$$dS(t) = S(t) \big( \mu\, dt + \sigma\, dW(t) \big) \qquad (9)$$

where W is a Wiener process, approximating (in the continuum limit) the Markov chain
given by the binomial model. The trend, if p ≠ 1/2, goes to the drift $\mu$. The volatility $\sigma$ is the
intensity of the noise. This is a simple example of a stochastic differential equation. It is a
pathwise description of a diffusion process with $b(t,x) = \mu x$ and $a(t,x) = \sigma^2 x^2$ (see
Section 4.8).


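Although the paths of W are nowhere differentiable, Equation (9) can be given a meaning in
integral form,

$$S(t) = S(0) + \mu \int_0^t S(r)\, dr + \sigma \int_0^t S(r)\, dW(r) ,$$

where the last term is a stochastic integral with respect to W (see Section 5.10).

This equation can be solved explicitly (this is not common, of course). The solution is the
stochastic process given by

$$S(t) = S(0) \exp\Big\{ \mu t - \frac{\sigma^2}{2} t + \sigma W(t) \Big\} ,$$

and we can compute its law from here.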
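
Since W(t) has the N(0,t) law, the explicit solution makes sampling S(t) immediate. The sketch below draws values of S(t) from the formula and compares their empirical mean with $S(0) e^{\mu t}$, which is the expectation of the solution. The parameter values, the sample size, the seed and the function name are assumptions chosen for illustration.

```python
import random
from math import exp, sqrt


def sample_share_price(s0, mu, sigma, t, rng):
    """One draw of S(t) = S(0) exp(mu*t - sigma^2*t/2 + sigma*W(t)), W(t) ~ N(0, t)."""
    w_t = rng.gauss(0.0, sqrt(t))
    return s0 * exp(mu * t - 0.5 * sigma**2 * t + sigma * w_t)


rng = random.Random(0)
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0      # illustrative values (assumed)
samples = [sample_share_price(s0, mu, sigma, t, rng) for _ in range(50_000)]
print(sum(samples) / len(samples), s0 * exp(mu * t))   # sample mean vs. S(0) e^{mu t}
```

The term $-\sigma^2 t/2$ in the exponent is what makes the mean grow exactly at rate $\mu$; without it the sample mean would come out too large.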

The evolution of the whole portfolio value will be

$$dV(t) = B_t\, dI(t) + \Delta_t\, dS(t) .$$


5.10 Stochastic differential equations 

In general, a diffusion process X with characteristic functions a(t,x) and b(t,x) (called
respectively diffusion and drift coefficients) can be represented pathwise by means of the
stochastic differential equation

$$dX(t) = b(t, X(t))\, dt + a(t, X(t))^{1/2}\, dW(t) ,$$

with a suitable definition of the last term, which in general, when the function a depends
effectively on its second argument, does not possess an obvious meaning.

Diffusions can therefore be studied at the same time with the tools of partial differential
equations that describe the evolution of the laws in time, and with the tools of stochastic
processes and stochastic differential equations, that provide the evolution of the paths
themselves.

The word "diffusion" is taken from the physical phenomenon with that name: The movement
of particles in a fluid from regions of high concentration to regions of low concentration.
Heat "diffuses" in the same way, following the negative gradient of the temperature field
f(t,x). In one space dimension, it obeys the partial differential equation

$$\frac{\partial}{\partial t} f(t,x) = D\, \frac{\partial^2}{\partial x^2} f(t,x) ,$$

where D is called the thermal diffusivity. Comparing with Kolmogorov equations (5), we see
that, with suitable initial conditions, f(t,x) is the density at time t and point x of a diffusion
process following the stochastic differential equation

$$dX(t) = \sqrt{2D}\, dW(t) ,$$

that is, essentially, the Wiener process.
6 Recommended books

• Nelson, Stochastic Modeling, Dover 1995
(Arrivals, queues, Markov chains, simulation)

• Gross-Harris, Fundamentals of queueing theory, Wiley 1998
(Queues)

• Asmussen-Glynn, Stochastic simulation, Springer 2007
(Simulation)

• Maruyama, Stochastic problems in population genetics, Springer 1977
(Diffusions, application to genetics)

• Lamberton-Lapeyre, Introduction au calcul stochastique appliqué à la finance, Ellipses 1997
(Diffusions, stochastic differential equations, application to finance)
Dedication: Io non crollo

This survey is based on a course given by the author in the Università degli Studi dell'Aquila,
as a part of the Intensive Programme Mathematical Models in Life and Social Sciences, in
July 2008.

The year after, on April 6th 2009, the building where the programme took place was destroyed
by a strong earthquake that caused more than 300 deaths in the region.

This work is dedicated to the people that died, lost a beloved one, or lost their homes that
day. I adhere to the motto that helped the university people to carry on after the disaster:

Io non crollo.

Photo: Renato di Bartolomeo